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Abstract 

Higher order tensor inversion is possible for even order. We have shown that a tensor group endowed 
with the Einstein (contracted) product is isomorphic to the general linear group of degree n. With the 
isomorphic group structures, we derived new tensor decompositions which we have shown to be related 
to the well-known canonical polyadic decomposition and multilinear SVD. Moreover, within this group 
structure framework, multilinear systems are derived, specifically, for solving high dimensional PDEs 
and large discrete quantum models. We also address multilinear systems which do not fit the framework 
in the least-squares sense, that is, when the tensor has an odd number of modes or when the tensor has 
distinct dimensions in each modes. With the notion of tensor inversion, multilinear systems are solvable. 
Numerically we solve multilinear systems using iterative techniques, namely biconjugate gradient and 
Jacobi methods in tensor format. 

Keywords: tensor and matrix inversions, multilinear system, tensor decomposition, least-squares 
method, 

1 Introduction 

Tensor decompositions have been succesfully applied across many fields which include among others, chemo- 
metrics [35] , signal processing [51 [T3J and computer vison [1^ . More recent applications are in large-scale 
PDEs through a reduced rank representation of operators with applications to quantum chemistry |28j 
and aerospace engineering [19 . Beylkin and Mohlenkamp [3J [4] used a technique called separated rep- 
resentation to obtain a low rank representation of multidimensional operators in quantum models; see 
[3111]. Hackbusch, Khoromskij and Tyrtyshnikov 22, [23] have solved multidimensional boundary and eigen- 
value problems using a reduced low dimensional tensor-product space through separated representation 
and hierarchical Kronecker tensor from the underlying high spatial dimensions. See the survey papers 
[321 Ell [29] and the references therein for more applications and tensor based methods. Extensive studies 
(e.g. [TOl [12l HH [30]) have exposed many aspects of the differences between tensors and matrices despite 
that tensors are multidimensional generalizations of matrices. 

In this paper, we continue to investigate the relationship between matrices and tensors. Here we address 
the questions: when is it possible to matricize (tensorize) and apply matrix (tensor) based methods to high 
dimensional problems and data with inherent tensor (matrix) structure. Specifically, we address tensor 
inversion through group theoretic structures and by providing numerical methods for specific multilinear 
systems in quantum mechanical models and high-dimensional PDEs. Since the inversion of tensor impinges 
upon a tensor-tensor multiplication definition, the contracted product for tensor multiplication was chosen 
since it provides a natural setting for multilinear systems and high-dimensional eigenvalue problems consid- 
ered here. It is also an intrinsic extension of the matrix product rule. Still other choices of multiplication 
rules could be considered as well for particular application in hand. For example, in the matrix case, there 
are the alternative multiplication of Strassen [42] which improves the computational complexity by using 
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block structure format and the optimized matrix multiplication based on blocking for improving cache per- 
formance by Demmel [TH]. In a recent work of Van Loan the idea of blocking are extended to tensors. 
Our choice of the standard canonical tensor-tensor multiplication provides a useful setting for algorithms 
for decompositions, inversions and multilinear iterative solvers. 

Like tensors, multilinear systems are ubiquitous since they model many phenomena in engineering and 
sciences. In the field of continuum physics and engineering, isotropic and anisotropic elastic models |34| 
are multilinear systems. Multilinear systems are also prevalent in the numerical methods for solving partial 
differential equations (PDEs) in high dimensions, although most tensor based methods for PDEs require 
a reduction of the spatial dimensions and some applications of tensor decomposition techniques. Here we 
focus on the iterative methods for solving the Poisson problems in high dimension in a tensor format. 
Tensor representations are also common in large discrete quantum models like the discrete Schrodinger and 
Anderson models. The study of spectral theory of the Anderson model is a very active research topic. The 
Anderson model £Q, Anderson's celebrated and ultimately Nobel prize winning work is the archetype and 
most studied model for understanding the spectral and transport properties of an electron in a disordered 
medium. Yet there are still many open problems and conjectures for high dimensional d > 3 cases; see 
[2"51 13T1 |4"T] and the references therein. The Hamiltonian of the discrete Schrodinger and Anderson models 
are tensors with an even number of modes; they also satisfy the symmetries required in the tensor SVD wc 
described. Moreover, computing the eigenvectors to check for localization properties not only demonstrate 
the efficacy of our algorithms, but it actually gives some validation and provide some insights to some of 
the conjectures [25l [3TJ [41] . Recently, Bai et al. [2] have solved some key questions in quantum statistical 
mechanics numerically. For instance, they have developed numerical linear algebra methods for the many- 
electrons Hubbard model and quantum Monte Carlo simulations. Numerical (multi)linear algebra techniques 
are increasingly becoming useful tools in understanding very complicated models and very difficult problems 
in quantum statistical mechanics. 

The contribution of this paper is three- fold. First, we define the tensor group which provides the frame- 
work for formulating multilinear systems and tensor inversion. Second, we discuss tensor decompositions 
derived from the isomorphic group structure and relate them to the standard tensor decompositions, namely, 
canonical polyadic (CP) [71 [23] and multilinear SVD decompositions [321 [331 [33 E] ■ We have shown that the 
tensor decompositions from the isomorphic properties are special cases of the well-known CP and multilin- 
ear SVD with symmetries while satisfying some conditions. Stegeman |39[ I40j extended Kruskal's existence 
and uniqueness conditions for CP decomposition for cases with various forms of symmetries (i.e. existence 
of identical factors). These decompositions appear in many signal processing applications; e.g. see [9] and 
the references therein. When the tensor has the same dimension in all modes, the tensor eigenvalue decom- 
position in Section 3 is the tensor eigendecomposition described by De Lathauwer et al. in [16] which is 
prevalent in signal processing applications, namely in, the blind identification of underdetermined mixtures 
problems. Last, we describe multilinear systems in PDEs and quantum models. We provide numerical 
methods for solving multilinear systems of PDEs and tensor eigenvalue decompositions for high dimensional 
eigenvalue problems. Multilinear systems which do not fit in the framework are addressed by providing 
pseudo-inversion methods. 

2 Preliminaries 

We denote the scalars in K with lower-case letters (a, b, . . .) and the vectors with bold lower-case letters 
(a, b, . . .). The matrices are written as bold upper-case letters (A, B, . . .) and the symbol for tensors are 
calligraphic letters (A,B,...). The subscripts represent the following scalars: {A) i: j k = (A)jj = ay, 
(a)j = Oi. The superscripts indicate the length of the vector or the size of the matrices. For example, is 
a vector with length K and J$ NxK is a N x K matrix. In addition, the lower-case superscripts on a matrix 
indicate the mode in which has been matricized. 

The order of a tensor refers to the cardinality of the index set. A matrix is a second-order tensor and a 
vector is a first-order tensor. 

Definition 2.1 (even and odd tensors) Given an Nth tensor T <G ^ixi 2 x...xZw _ If N is even (odd), 
then T is an even (odd) Nth order tensor. 
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Definition 2.2 (Einstein product [20]) For any N, the Einstein product is defined by the operation *n 



via 



(A.*N $)i 1 ...i N k N + 1 ...k M — ^ ^ a i 1 i 2 ...i N k 1 ...k N bk 1 ...k N k N+1 k N+2 ...k M ■ (2-1) 

k\ . . .kpj 

where A £ T Ilt ...j N!Kli .„ tKN (M.) and B € T Ki ,... jKn>Kn+u ... iKm (K). 

For example, if T, 5 £ jj-fx Jx/xj^ p era tion * 2 is defined by the following: 

/ ,/ 

(T*2S) ij y = J2J2 t ^ S uvlr ( 2 -2) 

1 V— 1 

The Einstein product is a contracted product that it is widely used in the area of continuum mechanics 
[33] and ubiquitously appears in the study of the theory of relativity [2D] . Notice that the Einstein product 
*i is the usual matrix multiplication since 

K 

mikn k j = (MN)y (2.3) 

k=l 

for M £ R IxK ,N £ R KxJ . 

Definition 2.3 (Tucker mode-n product) Given a tensor T £ jgJxJxif an g ^ e ma i r { ces A e M /x/ , 
B £ R JxJ and C € M. KxK , then the Tucker mode-n products are the following: 

i 

(T »i A)- i j k = Y i t ijka~ ii , ^i,j,k (mode- 1 product) 
j=i 
j 

(T«2B)- i fc = 2Z<ijfc6jj, Vj,«, fc (mode- 2 product) 

i=i 

(T^Cj^j = Y / t ijkC i , k , yk,i,j (mode- 3 product) 



Notice that the Tucker product •„ is the Einstein product *i in which the mode summation is specified. 
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Figure 1: Matrix representation of S £ M. IlXl2XlsXli where I 1 = 3,I 2 = 3,/ 3 = 7, 1 4 = 7 with 3x3 
matrix slices. There are 7 • 7 total 3x3 matrix slices. Here are nine matrix slices with the indices fixed 
at (i 3 ,i 4 ): Sg^ )U=8J S^ 3;ii=4 , Sg^ (i4=6 (top row, right), S^]^, S^ 4i4=4 , S^\ i4=5 (middle row, 

j(3,4) 0(3,4) q(3,4) 



right), S^ )i4=3 ,S^ iU=4 ,S^ jU=5 (bottom row, right) 



The definitions below describe the representation of higher-order tensors into matrices. 
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Definition 2.4 (Matrix and subtensor slices) A third-order tensor S G ^ixJxK /j as three types of 
matrix slices obtained by fixing the index of one of the modes. The matrix slices of S £ J^ IxJxK are 
following: Sj =a G R JxK with fixed i = a, Sj =a £ R IxK with fixed j = a and S| =Q G R IxJ with fixed 
k = a. For a Nth-order tensor S G K / i x7 2x/ 3 x...x/ N ^ ^ e suo tensors are the (N — l)th-order tensors denoted 
by <S"_ a G K / i x/ 2x7 3 x...x/„_ix7„ + ix...x7 J v w /jj c /j are obtained by fixing the index of the nth mode. 

Definition 2.5 (Matrix slices with several indices fixed) A fourth-order S G x -r 2 x j 3 x z 4 ^ as s ^ x 

types of matrix slices by fixing two indices. A matrix slice of S £ R Ix x 12 x Is x 14 is S^^ aii _g G R /lX/2 with 
«3 = a and 14 = j3 fixed. In general, for any Nth order tensor S G jj / i x -f2x7 3 x...xi N ^ there are (iy_ 2 ) different 
matrix slices by holding JV-2. A matrix slice ofS G rJix^xz 3 x...xz„ j g(3,4 ,...,JV) g M 7lX/2 witt 

it J i 3 — a 3 ,14 — Q4 , . . . ,2jV — ct^v 

indices is,...,i n fixed. The subscripts (3,4) and (3,4, . . . , N) indicate which indices are fixed. Moreover, 



(<5)iii 2 i 3 U — (S 



(3,4V 
i 3 ,i 4 ■ 



This definition is different from Definition 2.4 since several indices are fixed at a time; see Figure 1. The 
matrix representation in Figure 1 is consistent with the matrix representation in Matlab where the last 
N — 2 indices are fixed. 



3 Tensor Group Structure and Decompositions 

For the sake of clarity, the main discussion is limited to fourth-order tensors, although all definitions and 
theorems hold for any even high-order tensors. Here a group structure on a set of fourth order tensor 
through a push-forward map on the general linear group is defined. Also several consequential results from 
the group structure will be discussed. 

Definition 3.1 (Binary operation) A binary operations on a setG is a rule that assigns to each ordered 
pair (.A, 23) of elements of G some element ofG. 

Definition 3.2 A group (G, *) is a set G, closed under a binary operation *, such that the following axioms 
are satisfied: 

(Al) The binary operation * is associative; i.e. (A * 23) * C — A * (23 ★ C) for A, 23, 6 G G. 

(A2) There is an element £ G G such that £ * X = X * £ for all X G G. This element £ is an identity 
element for * on G. 

(A3) For each A £ G, there is an element A G G with the property that A*A = A*A = E,. 

Definition 3.3 (Transformation) Let A G ¥7^7^/^/2(11) and A G Mj 1 j 2 j 1 ] 2 (R) . Then the transforma- 
tion f : Tj 1 j 2 j 1 j 2 (R) — > M.i 1 i 2 j 1 i 2 (R) with f(A) — A is defined component-wise as 

(A)i li2 i li2 -» (A)[i 1+ ( l2 _i)/ 1 ][. il+ (i 2 _i)7 l] (3.1) 
If A G T Jl)/2i/3) / 4 (M) and A G M Ill2 j 3h (R) , then 

Moreover, the transformation is 

(■A)i 1 i 2 ...i N j 1 h...j N -> ( A \h+i:^ 2 (i k -i-)Yii= 1 1 ^[ji+E^uowmfc! 1 Ji] ( 3 - 3 ) 

when A G T Iu ..,j Nt j u ... !jN (M.) and A G M Ii ... In>Ji ,„j n (R). 

These transformations which are known as column (row) major format in many computer languages 
are typically used to enhance efficiency in accessing arrays. This mapping ( |3.1[ ) is commonly used in 
matricization of fourth order tensors in signal processing applications; e.g. see |16j . 
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Lemma 3.4 Let f be the map defined in (3.1). Then the following properties hold: 



1. The map f is a bijection. Moreover, there exists a bijective inverse map f 1 : M/ 1 / 2) j 1 j 2 (R) — > 

2. The map satisfies f(A * 2 B) = f(A) ■ f(B) where '■ ' refers to the usual matrix multiplication 
Proof. 

(1) According to the definition of /, we can define a map h : I\ X I2 — > I\I 2 by h(i 1 ,i 2 ) = ii + (i 2 — 1)A 
where Ik = {1, . . . , Ik} and Ikh = {1, . . . , Ikh}- Clearly, the map h is a bijection so it follows / is a 
bijection. 

(2) Since / is a bijection, for some 1 < i, 3 ' < I\I 2 , there exists unique ii,ii,ji,j2, for 1 < i\,j\ < I\, 1 < 
«2, 32 < h such that (i 2 - l)Ii +h = i, and (j 2 - l)ii + ji = j. So, 



[/(•^ *2 B)]ij — (J\. *2 B)i 1 i 2 j 1 j 2 — ^ ' o,i 1 i 2UV b UV j 1 



j 2 



{f(A)-f(B)}, u = J2if(-A)Uf(B)} ri . 

r=l 

For every 1 < r < I1I2, there exists unique u, v such that (u — + v = r. So, 

^ ^ a i 1 i 2 uvbuvj 1 j2 = ^ ' [/(-^)]ir[/(^)]rj- 
r— 1 



□ 



It follows from the properties of / that the Einstein product (2.2) can be defined through the transfor- 
mation: 

A* 2 B = f-'ifiA * 2 B)] = f-'lHA) ■ f(B)]. (3.4) 
Consequently, the inverse map f~ l satisfies 

r i (A-B) = r i (A)* 2 r i (B). (3.5) 

Recall that a subset of M^-^r(M) consisting of all invertible N x N matrices with the matrix mul- 
tiplication is a group. Let the subset M C M.j 1 j 2 j 1 j 2 (M) contain all invertible Iil 2 x -Zi/2- Define 
T = {T e T Iul2jlj2 {R) : det(f(T)) ^ 0} where T e M with T = f(T). 

Theorem 3.5 Suppose (M, •) is a group. Let f : T — > M be any bijection. Then we can define a group 
structure on T by defining 

A* 2 B = f- 1 [f(A)-f(B)] 

for all A,BgT. In other words, the binary operation * 2 satisfies the group axioms. Moreover, the mapping 
f is an isomorphism. 

The proof is straightforward as it is in the matrix analogue. See the details of the proof in the Appendix. 
Moreover, the group structure can be cast a ring isomorphism. Further discussion or requirement of the 
ring structure is not needed hereafter so we do not include the proof. 

Corollary 3.6 Define T = {T G T Iu .„j N> j u ...j N (R),det(f(T)) ^ 0} with I m = J m for m = 1, . . . ,N. 



Then the ordered pair T is a group where the operation *n is the generalized Einstein product in (2.1) 



Proof. The generalization of the transformation / (3.3 1 on the set T with the binary operation *jv easily 



provide the extension for this case. □ 



() 



Theorem 3.7 The ordered pair (T, *n) where T = {T £ T/j^.^j^ J is not a group under the operation 

Proof. Take N = 2. Then T*2<S ^ T where T,S £ r ^i 1 ,i 2 ,h- ^ follows that T is not closed under * 2 . Thus 
the ordered pair (T, *a) is not a group. □ 



Theorem (3.7) implies that odd order tensors have no inverses with respect to the operation *jv, although 



such binary operation may exist in which the set of odd order tensors exhibits a group structure. Lemma 



(3.4) and Theo rem (3.5) show that the transformation / is an isomorphism between groups T and M. From 
Corollary (3.6 1, it follows that these structural properties are preserved for any ordered pair (T, *jv) for any 
N. Thus in the following Section 3.2, some properties and applications of the tensor group structure are 
addressed. 

Tensors T <E Tj 1 j 2 j 3 j i (M.) with different mode lengths which are similar to rectangular matrices have 
no inverses under *2- In Section 6, we discuss pseudo-inverses for odd order tensors and even order tensors 
with distinct mode lengths. 

3.1 Decompositions via Isomorphic Group Structures 



Theorem |3.5| implies that (T, +2) is structurally similar to (M, ■). Thus we endow (T, +2) with the group 
structure such (T, * 2 ) and (M, •) are isomorphic as groups. This section discusses some of the definitions, 
theorems and decompositions preserved by the transformation. 

Definition 3.8 (Transpose) The transpose of S £ R /xJx/xJ i s a tensorT which has entries tijki — Skiij- 
We denote the transpose of S as T = S T . If S e U IiXI2 " xI » xJiX - xJn , then (T)i u i 2 ,..., in ,j uj2 ,..., jn = 
( S )Ji,j2,...,jn,ii,iz,-,in is the tran spose of S . 

Definition 3.9 (Symmetric tensor) A tensor S € ^ixi2--.xi N x J±x...x J N j s S y mme t r i c if $ = S T , that 

IS, Sii,i 2 ,...,i n ,ji,j2,----jn = s ji,j2,--,3n,i\,i2,—,in' 

Definition 3.10 (Orthogonal tensor) A tensor U £ R /xJx/xJ { s orthogonal ifU T *2 U = I where I is 
the identity tensor under the binary operation *2- 

Definition 3.11 (Identity tensor) The identity tensor 8 is 



1, l = k 

0, l^k. 



N 



where 

8ik = 

It generalizes to an 2Nth order identity tensor, 

C^Oiiia— ijviii2— j'jv = J"J 8i k j h - (3-6) 

fe=i 

Definition 3.12 (Diagonal tensor) A tensor!) £ R IxJxIx ' 7 is called diagonal if dijki = when i^k 
and j ^= I. 

The diagonal tensor T> £ K 7 i x - x/ « m tensor decompositions like in Parallel Factorization and Canonical 
Decomposition[71 124j has nonzero entries (ii 1 ,i 2 ,... > i n when i-y = . . . = iff. Definition |3. 12 has in general more 



non-zero entries than the usual definition. This definition is consistent with the identity tensor (|3.6|, that 
is, the diagonal and the identity tensors have nonzero entries on the same indices. 
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Theorem 3.13 (Singular value decomposition (SVD)) Let A G jj^xjx/xj njith R = rank(f(A)). 
The singular value decomposition for tensor A has the form 

A = U * 2 V * 2 V T (3.7) 

where U G ^ixJxix-J an d y g R /xJx/xJ are orthogonal tensors and V G K /x Jxlxj i s a diagonal tensor 
with entries o~ijij called singular values. Moreover, the decomposition [3.1) can be written as 

^ = ££^Ku£f^°(V^, (3-8) 
a sum of fourth order tensors. The matrices U^' 4 ^ and V^' 4 ^ are called left and right singular matrices. 



The symbol o denotes the outer product where Aijki — By o Cm = ByCjy. Recall from Definition 2.5 that 
U^' 4) and vg' 4) are matricizations of fourth order tensors U and V, respectively. 



Proof. Let A = f(A). From the isomorphic property (3.5 1 and Theorem 3.5 we have 

A = U D V T 1— A = U * 2 V * 2 V T 
where U and V are orthogonal matrices and D is a diagonal matrix. In addition, U • U T = I and 
V ■ V T = I ^— >■ U T * 2 U = 1 and V T * 2 V = X. □ 

Theorem 3.14 (Eigenvalue decomposition(EVD) for symmetric tensor) Let A G ^ixJxlxJ an( j 
R = rank(f(A)). A is a real symmetric tensor if and only if there is a real orthogonal matrixV G R IxJxIxJ 
and a real diagonal matrix T> G m IxJxIxJ such that 

A = V * 2 f> *2 V T (3.9) 

where V G R IxJxIxJ { s an orthogonal tensor and T> G R IxJxIxJ j s a diagonal tensor with entries o^mj 
called eigenvalues. Moreover, the decomposition (3.51) can be written as 



kl 



a sum of fourth order tensors. The matrix P^' 4 '' G M /X " r is called an eigenmatrix. 



Proof. From the isomorphic property (3.5) and Theorem 3.5 we obtain that there exist some orthogonal 
matrix V and diagonal T> such that A — V * 2 T> * 2 V T . Moreover, the fourth order tensor P^%i = ° 
(P^h is symmetric since V iiVi = EutlPkl^h ° = EaA^mV^, = £ s P« • = 



'1J IJIJ « ' J v «' 'V ^1D,%3 'J" 1 ijkl 

■> ..T>„ T>T - (-p(3,4)N Cp(3,4)v. _ p.. 

r ijkl' ijkl ~ \ r kl Jij u \ r kl >V ~ r ijij' 



V P- -P T -T -V- V T - CP (3 ' 4) V- o CP (3 ' 4) V- - v~~ r 

Z^s r rs r rs — l^ijij I ijkl 1 ijkl — \ r kl hi u \ r kl >V ~ ' Hii' LJ 



Remark 3.15 If the eigenmatrix (P^' 4 -*) is symmetric, that is, (Pjy' 4 ^)r/ = (Pw'^)jij then the entries of 
A has the following symmetry: ajuk = a-ijkl- If <ijilk = &ijki and aijki = CLklij; then (3.10) is exactly the 
tensor eigendecomposition found in the paper of De Lathauwer et al. [16] when I = J . The fourth order 
tensor in \16§ is a quadricovariance in the blind identification of underdetermined mixtures problems. 

3.2 Connections to Standard Tensor Decompositions 

In 1927, Hitchcock [23 HZ] introduced the idea that a tensor is decomposable into a sum of a finite number 
of rank-one tensors. Today, we refer to this decomposition as canonical polyadic (CP) tensor decomposition 
(also known as CANDECOMP [7] or PARAFAC [23]). CP is a linear combination of rank-one tensors, i.e. 

R 

T = ^""^ a r ° b r o c r o d r (3-11) 
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where T G R IxJxKxL , a r € R 1 ,b r € R J , c r £ R K and d r G R L . The column vectors a r , 6 r , c,. and d r form 
the so-called factor matrices A, B, C and D, respectively. The tensorial rank [37] is the minimum R E N 
such that T can be expressed as a sum of R rank-one tensors. Moreover, in 1977 Kruskal [32] proved that 
for third order tensor, 

2R + 2 < rank(A) + rank(B) + rank(C) 

is the sufficient condition for uniqueness of T = 5Zr=i a r ob r o c r up to permutation and scalings. Kruskal's 
uniqueness condition was then generalized for n > 3 by Sidiropoulous and Bro |38j : 



2i? + (n - 1) < rank{A {3) ) 

3 = 1 



(3.12) 



for T = J2r=l a r ° • • • ° ttr • 

Another decomposition called Higher-Order SVD (also known as Tucker and Multilinear SVD) was 
introduced by Tucker [43] |44j |45j [14] in which a tensor is decomposable into a core tensor multiplied by a 
matrix along each mode, i.e. 



T = 5»i A» 2 B» 3 C» 4 D 



(3.13) 



where T,S G R jxjx ^ xi are fourth order tensors with four orthogonal factors A G K /x/ , B g R jxj , 
C G M KxK and D G R LxL . Note that CP can be viewed as a Tucker decomposition where its core tensor 
S G R IxJxKxL i s diagonal, that is, the nonzeros entries are located at (S)uu. 
The tensor SVD 13/7] can be viewed as CP and multilinear SVD. 



Lemma 3.16 Let T G R- rxJ x- rxJ and R = rank(f(T)). The tensor SVD ETty in Theorem 3J% is equiv- 
alent to CP \3.1l\ if there exist A G R IxIxJ ,B G R JxIxJ , C G R lxlx ^and V G R^^sucA that 
a lk ibjki = u ijk i and c lkl d- jkl = v Vjkl . 

Proof. Define r = k + {I - 1)1. Then 



r=l 



where a k m = a rr . Since u ijk i 
it follows that 



( U w 4) )ij = (Ur*%- = Oir&jr and 



t(3,4) 



R 



4h 



(V< 3 ' 4) )y 



E^,(U( 3 - 4 )) y - o (V^ 4 ))^ =J2*rra ir b jr q r d 3r 



r=l 



r=l 



r=l 



Then, 



T 



(3.14) 



r=l 



Moreover, the factor matrices A G R IxIJ , B G M ,/x/J , C G R IxU and D G R jx/J are built from concate- 
nating the vectors a r , b r , c r and d r , respectively. □ 



Remark 3.17 To satisfy existence and uniqueness of a CP decomposition, the inequality (3.12) must hold; 
i.e. 2/J + 3 < rank(A) + rank (H) + rank (C) + rank (D). If R = rank(f(A)) = I J, then the decomposition 
3.14) does not satisfy [ 3.12\ . However, if f{T) is sufficiently low rank, that is, R — rank(f(T)) < I J 



for some dimensions I and J, then \3.12 ) holds. Futhermore, the existence of the factors A, B, C and D 
requires that the matricizations,\J^ k ,^ and vSy 4 ', to be rank-one matrices. 
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Lemma 3.18 Let T G 



plxjxlxj 



alent to multilinear SVD (3.13) if there exist A G 
a-ikbji = Uijki and c ik dji = v ijkl . 



and R — rank(f(A)). The tensor SVD (3.1) in Theorem 3.13 is equiv- 



plxl 



B g 



JxJ 



, C G 



and D G 



such that 



Proof. From d3.8b, we have 7^ = J2u °"Wfei( U w ^h^kT'hj which implies = £ fei o klk ia lk bjic lk d' jV 
□ 



r(3,4)x 



Remark 3.19 Typically, the core tensor of a multlinear SVD (3.13) is dense. However, the core tensor 
resulting from Lemma 3.18 is not dense (possibly sparse); i.e. there are IJ nonzeros elements out of I 2 J 2 



entries in the fourth order core tensor of size L x J x I x J . Similarly, the existence of the decomposition 
impinges upon the existence of the factors A G M 7x/ ,B G R jxj , C G R lxl and D G R jxj such that 
U = Ao B oWV = CoD. 



ixj xixj j s S y mme i r i c an d ji = rank(f(A)) . The tensor EVD 3.10 in Theorem 
3.1 A is equivalent to CP fSJll) if there exist A G R IxIxJ , B G R JxIxJ such that a ik ib-i 



Corollary 3.20 LetT G 



MklOjkl = Pijkl- 



Corollary 3.21 Let T G R 1 x J x 1 x J_with symmetries tij k i = tkUj and tjiki — tijki with R = rank(f(A)). 



The tensor EVD 



3.10 



m Theorem 
such that aikib 3 ki = Pijkl- 



3.14 is equivalent to CP (3.11) if there exist A G 



vlxlxj 



B G 



\JxIxJ 



Remark 3.22 The CP decomposition from Corollary 3.20 is 7~ ^ = X)r=i °yr(Pr 3 ' 4 ) 



with identical factors: A = C and B = D from Lemma 



the existence of the factors A and B requires that the matricization, P 



Corollary 
Thus, (P kl 



3.21 



(3,4) 
kl 



3.16 



As in Remark 3.17, 



to be rank-one matrices. In 



the added symmetry tj ik i = tij k i implies that P^' 4 ^ is symmetric (as well as rank-one) 



)ij — aikiajki 
decomposition [101 . 



T = J2 r =i (T rrO'irO'jrO J ~ ir aj r . This decomposition is known as symmetric CP 



Cor ollar y 3.23 LetT G 
rem 



plxjxl> 



J is sy mmetr ic and R = rank(f(Aj) . The tensor EVD (3.10) in Theo- 
3.14 is equivalent to multilinear SVD (3.13) if there exists A G R IxIxJ such that aikb^i = 



a-ikOjl — Pijkl- 



Remark 3.24 The multilinear SVD from Corollary (3.23) is T-g 



J2h Vkikidikbjia^b^ following Lemma 3.18 



Y,kl VklkliPki^hj 



o (Pit 4) ) 



4 Multilinear Systems 

A multilinear system is a set of M equations with TV unknown variables. A linear system is a multilinear 
system which is conveniently expressed as Ax = b where A G R MxN ,x G and b G R M . Similarly, 
A * 2 X = B has M = I ■ J ■ K ■ L equations and N = R ■ S ■ K ■ L unknown variables if A G M. IxJxRxS , X G 
r rxSxKxl and B e M /xjxifxL Equivalently, we define a linear transformation £ : R N — > E M such that 
£(x) = Ax with the property £(cx + dy) = c£(x) + <i£(y) for some scalars c and d. A bilinear system is 
defined through 03 : R AI x R N — > K with 03(x, y) = y T Bx = B »i x » 2 y where B G R MxN . The bilinear 
map has the linearity properties: 

! B(cx 1 +dx 2 ,y) = c < B(x l! y)+d ( 8(x 2 ,y) 
and 

Q3(x,cy 1 + dy 2 ) = c«8(x,yi)+dQ3(x,y 2 ) 

for some scalars c, c, d, d and vectors x, xi, x 2 G R N , y, yi, y 2 G R M . 

We can define multilinear transformations M. : Rf 1 x ' ' ' x In — > R j ix...x.j m f or following multilinear 
systems: 

• B »i x » 2 y = b where B G R IxJxK , x G R 1 , y G R J and tet 

• M* 2 X* 2 Y = 6 where M G M. IxJxKxL , X G R KxL , Y G R lxj and del 
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• M *2 X * 3 y = B where M € 



pIxJxKxLxMxN 



X G 



pMxNxO 



ye 



pKxLxO 



and B G 



tlx J 



Multilinear systems model many phenomena in engineering and sciences. In the field of continuum 
physics and engineering, isotropic and anisotropic elastic models are multilinear systems |34) . For example, 

C * 2 E = T 

where T and E are second order tensors modeling stress and strain, respectively, and the fourth order tensor 
C refers to the elasticity tensor. Multilinear systems are also prevalent in the numerical methods for solving 
partial differential equations (PDEs). To approximate solutions to PDEs, the given continuous problem 
is typically discretized by using finite element methods or finite difference schemes to obtain a discrete 
problem. The discrete problem is a multilinear system with finitely many unknowns. 

4.1 Poisson problem with multilinear system solver 

Consider the two-dimensional Poisson problem 



~V 2 v = f 
u = 



in Q, 
on r 



(4.1) 



where £1 = {(x, y) : < x, y < 1} with boundary T, / is a given function and 

^ 2 d 2 v d 2 v 
dx 2 dy 2 



We compute an approximation to the unknown function v(x,y) in (4.1). Several problems in physics and 
mechanics are modeled by (4.1) where the solution v represent, for example, temperature, electro-magnetic 



potential or displacement of an elastic membrane fixed at the boundary. 

The mesh points are obtained by discretizing the unit square domain with step sizes, Ax in the re- 
direction and Ay in the y-direction; assume Aa; = Ay for simplicity. From the standard central difference 
approximations, the difference formula, 



2v, 



Ax 2 



+ 



Ay~ 2 



f(xi,y m ), 



is obtained. Then the difference equation (4.2) is equivalent to 

A N V + VA N = (Aa;) 2 F 

where 



(4.2) 



(4.3) 



k-N 



2 -1 
-1 2 









(4.4) 



"11 V W 
«21 V22 

VN1 ■ ■ ■ 



VlN 



VN-1N 
VNN-1 v NN 



and 



in /12 

/21 /22 



TIN 



'■ '■ '■ /n-in 
fm ■ ■ ■ Jnn-i Inn 



(4.5) 



where the entries of V and F are the values on the mesh on the unit square where (xi,yj) = (iAx,jAx) G 
[0, 1] x [0, 1]. Here the Dirichlct boundary conditions arc imposed so the values of V are zero at the boundary 
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of the unit square; i.e. Vio — vin+i = Vo,j 
vectorized which leads to the linear system: 



Vtj+ij = for < i, j < N + 1. Typically, V and F are 



(4.6) 





An + 2In —In 











fu 


Anxn ■ v — 


—I n A n + 2I N 






Vl2 


= (Ax) 2 


hi 






In 








Inn 







—In An + 2In 




VNN 





In |17j . the Poisson's equation in two-dimension is expressed as a sum of Kronecker products; i.e. 

Anxn = In ® A N + A N ® In 

The discretized problem in three-dimension is 

(A N <E> In <8> In + In ® A N <E> In + In <£> In <E> A n ) ■ wec(V) = vec(F). 



(4.7) 
(4.8) 



High dimensional Poisson problems are formulated as sums of Kronecker products with vectorized source 
term and unknowns. 



4.1.1 Higher-Order Tensor Representation 



The higher-order representation of the 2D discretized Poisson problem (4.1) is 

A N * 2 V = F 



(4.9) 



where An G 



vNxNxNxN 



and matrices, V and F, are the discretized functions v and / on a unit square 



mesh defined in (4.5 ). The non-zeros entries of the matrix slice An^'jd € 



pNxN 



are the following: 



(Ax) 2 



(\ (3,4) \ 

( a ( 3 . 4 ) \ - -1 

I A (3,4) \ 
( A N fe=Q!;=/3 ja+l,,3 

( A N fc=ct!;=(3 )a,/3-l 

I A (3,4) \ 
l A N fc=Q!;=(9 jQ,^ + l 



(Ax) 2 
-1 
(Ax) 2 
-1 
' (Ax) 2 
-1 
(Ax) 2 



(4.10) 



for a, ft = 2, ... , N—l. These entries form a five-point stencil; see Figure[2j The discretized three-dimensional 
Poisson equation is 



A N * 3 V = T 



(4.11) 



where A N € R^xNxNxNxNxn and Vj jr g p jv <.\ ,\ 

Similarly, the entries of the subtensor slice (An )i 
i.e. 



(4,5,6) 



containing the values on the discretized unit cube. 

of An would follow a seven-point stencil; 



tNxNxN 



((An) 
((An) 
((An) 
((An) 
((An) 
((An) 
((An) 



4.5 



4,5,6) v 

a,m=/3,n=~/)a-l,l3,J 
) 



(Ax) 



4,5,6) 

=a,m=/3,n=7/'«+l.P:7 
4,5,1 

4,5,6) \ 

4,5,6) v 
=Q,m=^,n=7M,P,7-l 
4,5,6) 



(Ax) 3 

-1 
(Ax) 3 

-1 
(Ax) 3 

-1 
(Ax) 3 

-1 
(Ax) 3 

-1 
(Ax) 3 



(4.12) 



for a, j3, 7 = 2, . 



,N—1 since Uyfc satisfies 

6Uy& " Vi-ijk ~ Vi + ijk — Vij-ik — Uij+lfc — Vijk-l — Vijk+l = (Ax) 3 fijk- 



Ui-ljk 



i+ljk 



Multilinear systems like (4.9) and (4.11 ) are the tensor representation of high dimensional Poisson problems. 
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(a) 5-point Stencil (b) 7-point Stencil 



Figure 2: Stencils for higher-order tensors. 



4.2 Iterative Methods 

Here we discuss some methods for solving multilinear system. A naive approach is the Gauss-Newton 
algorithm for approximating A^ 1 through the function, 

g{X) = A* 2 X -T = 



where I is the identity fourth-order tensor defined in (3.6) and X is the unknown tensor. This method is 



highly inefficient due to a very expensive inversion of a Jacobian. 

To save memory and operational costs, we consider iterative methods for solving multilinear systems. 
The pseudo-codes in Table [I] describe the biconjugate gradient (BiCG) method for solving multilinear 
system, A*2 X — B, without matricizations. Recall that the BiCG method requires symmetric and positive 
definite matrix so that the multilinear system is premultiplied by its transpose A T which is defined in 
Section 3. The BiCG method solves multilinear system by searching along Xk — X^-i + ak-\Pk~i with a 
line parameter a^-i and a search direction V^-i while minimizing the objective function <p(Xk + Ok-iVk) 
where 4>{X) — \X T * 2 A *2 X — X T * 2 B. It follows that <fi(X) attains a minimum iteratively and precisely 
at an optimizer X where A *2 X = B. 

The higher-order Jacobi method is also implemented for comparison. The Jacobi method for tensors is 
an iterative method based on splitting the tensor into its diagonal entries from the lower and upper diagonal 
entries. In Figure [3j we approximate the solution to the multilinear system ( |4.9[ ) using two multilinear 
iterative methods: higher-order biconjugate gradient and Jacobi methods. See Table [l] for the pseudo-codes 
of the algorithms. In Figure [3] BiCG converged faster than Jacobi with fewer number of iterations. The 
convergence of Jacobi is slow since the spectral radius with respect to the Poisson's equation is near one 
|17| . The approximation in Figure [3] is first order accurate. 

Formulating the discretized Poisson equation in terms of higher-order tensors is convenient since its 
entries follow a stencil format in Figure [2] The boundary conditions are easily imposed without rearrange- 
ments of entries. Also the unknown v is solved on a higher-order mesh; no vectorization is needed. The 
multilinear system representation has the potential to become a reliable solver of PDEs in very high dimen- 
sion. For example, implementation of new tensor decompositions which reduce the number of tensor modes 
are required for higher dimensional problems. The use of low rank preconditioner in tensor format can 
dramatically increase convergence rates in iterative methods as in the case for sparse linear large systems 
0- 



5 An Eigenvalue Problem of the Anderson Model 

The Anderson model, Anderson's celebrated and ultimately Nobel prize winning work [1], is the archetype 
and most studied model for understanding the spectral and transport properties of an electron in a disordered 
medium. In 1958, Anderson [T] described the behavior of electrons in a crystal with impurities, that is, when 
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(a) Higher-Order Biconjugate Gradient 



(b) Higher-Order Jacobi 



Higher-Order Biconjugate Gradient Method 
Given A 6 K' ,1(fl>!J,1<JV 



, tol 



Initial guess Xa e 
A _ A T * 2 A 
B f— A T *a B 
X — Xq 

n = B-A*- 2 X 
V — Tl 

while ||7?.|| > tol 
t - (72., 72.) 
a 



y.Residual 
XSearcL Direction 



^Compute Search Parameter 
7.Update Solution 

7,Update Residual 



X ^'X + aV 

tz <- n - ocA h v 

r 

Vi—IZ + PT' '/.Compute Search Direction 
end 



Higher-Order Jacobi Method 



Given A £ 



S- 1 



'',MAX 



for k = 1 to MAX 
for i = 1 to N 
for j = 1 to N 

end 
end 
end 



/(A 



Table 1: Psuedo-codes for Iterative Solvers. 




(a) Aprroximated Solution (b) Bicongugate Gradient (blue -.-) and Jacobi (red — ) 

Figure 3: A solution to the Poisson equation in 2D with Dirichlet boundary conditions. 
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electrons can deviate from their sites by hopping from atom to atom and are constrained to an external 
random potential modeling the random environment. This is called the tight binding approximation. He 
argued heuristically that electrons in such systems result in a loss of the conductivity properties of the 
crystal, transforming it from conductors to insulators. 

5.1 The Anderson Model and Localization Properties 

The Anderson Model is a discrete random Schrodinger operator defined on a lattice 7L d . More specifically, 
the Anderson Model is a random Hamiltonian on £ 2 (Z d ), d > 1, defined by 

H U = -A + XV U (5.1) 

where A(x,y) = 1 if \x — y\ — 1 and zero otherwise (the discrete Laplacian) with spectrum [—2d, 2d] and the 
random potential = {V^^x), x € Z d } consists of independent identically distributed random variables on 
[—1, 1] which we assume to have bounded and compactly supported density p. The disorder parameter is 
the nonnegative A > 0. The spectrum of H u can be explicitly described by 

a(Hu) = cr(-A) + A supp(/9) = [-2d, 2d] + A supp(p). 



Remark 5.1 The random potential V u is a multiplication operator on £2(2 ) with matrix elements V u (x) — 
v x {uj) where (v x (uj)) xeIi d is a collection of (i.i.d.) random variables with distribution p indexed by Z d . 

The random Schrodinger operator model disordered solids. The atoms or nuclei of a crystal are distributed 
in a lattice in a regular way. Since most solids are not ideal crystals, the positions of the atoms may 
deviate away from the ideal lattice positions. This phenomena can be attributed to imperfections in the 
crystallization, glassy materials or a mixture of alloys or doped semiconductors. Thus to model disorder, a 
random potential V u perturbs the pure laplacian Hamiltonian (—A) of a perfect metal. The time evolution 
of a quantum particle ip is determined by the Hamiltonian if w ; i.e. 

xl>(t) = e ltH ^ . 

Thus the spectral properties of H u is studied to extract valuable information. The localization properties of 
the Anderson Model are of interest. For instance, the localization properties are characterized by the spectral 
properties of the Hamiltonian H^; see the references [351 ED ED. ■ The Hamiltonian H u exhibits spectral 
localization if has almost surely pure point spectrum with exponentially decaying eigenfunctions. 

Remark 5.2 Recall from \37l for any self-adjoint operator H, the spectral decomposition is 

a(H) = a v {H) U a ac (H) U a sc (H) 

corresponding to the invariant subspaces H p of point spectrum, H ac of absolutely continuous and H sc to 
singular continuous spectrum. 

The localization properties of the Anderson model can be described by spectral or dynamical properties. 
Let J Cl. 

Definition 5.3 We say that exhibits spectral localization in I if H u almost surely has pure point spec- 
trum in I (with probability one), that is, 

CT(-ff w ) n / C cr p (i? w ) with probability one 

Moreover, the random Schrodinger operator H u has exponential spectral localization in I and the eigenfunc- 
tions corresponding to eigenvalues in I decay exponentially. 
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Thus if for almost all w, the random Hamiltonian H u has a complete set of eigenvectors (i/v«)«eN in the 
energy interval / satisfying 

\1> u ,n(*)\ < C u , n e-^- x «>"\ 
with localization center x u>n for jj, > and C Wj „ < oo, then the exponential spectral localization hold on I. 



Remark 5.4 Let V : l-i^I?) — > ^(Z) fee a multiplication operator and suppose v : Z — > R is a function. 
Then, Vf(x) = v(x)f{x) and thus, o~(V) — range(v). Suppose f{x) is the Dirac delta function; i.e. 



1 X = Xq 

so, 



/(x) = S(x - x ) 

then V f(x) — v(xo)f(x) which implies that cr(V) = a p (V); i.e. V has a pure point spectrum. 




Figure 4: One-dimensional Eigenvectors of the Discrete Schrodinger Operator (-x-) and the Anderson Model 
(black, -o-) for various modes. 



Definition 5.5 A random Schrodinger operator has strong dynamical localization in an interval I if for all 
q > and all 4> € ^(2^) with compact support 



E 



sup || \X^e- %tM "xi(S u M J < oo 



t 



where xi is an indicator function and X is a multiplicative operator from ^2(Z d ) — > £2(2^) defined as 
\X\if> = \x\1>(x). 

Dynamical localization in this form implies that all moments of the position operator are bounded in time. 

As noted before, the Anderson model is a well-studied subject area for understanding the spectral and 
transport properties of an electron in a disordered medium, thus there are numerous results in both physics 
and mathematics literature; see [25] and the references therein. Mathematically, localization has been proven 
for the one-dimensional case for all energies and arbitrary disorder A. For example, Kunz and Souillard 
|33j have proven in 1980 for d — 1 and nice distribution p that localization is always present for any small 
disorder A. In 1987, Carmona et al [B] generalized this result in d = 1 for any distribution p. In any d 
dimension, for all energies and sufficiently large disorder (A >> 1), localized states are present. For d = 2 
and for Gaussian distribution, it is conjectured that there is no extended state for any amount disorder A 
similar to the results for d = 1. For d > 3, there exists Ao > such that for A < Ao, H has pure absolutely 
continuous spectrum. It is known (see [25]) that there exist Ai < 00 such that for A > Ai, H\ has dense 
pure spectrum. There are still many open problems like the extended state conjecture [2"T] . 
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(a) A = .1, N = 100 



Figure 5: One-dimensional Eigenvectors of the Discrete Schrodinger Operator (-x-) and the Anderson Model 
(black, -o-) for various modes. 



5.2 Approximation of Eigenvectors 

To approximate the eigenvectors of the multidimensional Anderson model, the eigenvalue decomposition in 



Theorem 3.14 is applied to the Hamiltonian H^. The Hamiltonian H u in two and three dimensions are 



formed into fourth- and sixth-order tensors using the same stencils in Figure [2] corresponding to the entries 



in (4.101 and (4.12|, respectively. The only main differences are that the center nodes are centered around 



zero and have random entries, 

and 

where a and r are random numbers with uniform distribution on [—1, 1] accounting for the random diagonal 



potential V u . With the formulations of the Hamiltonian like (4.7) and ( |4.8[ ), the uniform distribution on 
[— 1, 1] on the random potential cannot be guaranteed. But the higher-order tensor representation easily 
preserved this structure. To numerically compute the higher-dimensional eigenvector, tensor representation 
of the Hamiltonian is necessary before the appropriate Einstein product rules and mappings are applied. 

In Figures |4j [5j [6] |7] and [HJ the eigenfunctions are approximated by the eigenvectors from the both 
discrete Schrodinger and random Schrodinger (Anderson) models. In Figures [4] and [5j the eigenvectors of 
the Anderson Model in one dimension are definitely more localized than the eigenvectors of the discrete 
random Schrodinger model in one dimension which are consistent with the results in |25] for the Anderson 
model in one dimension. Observe that for large amount of disorder (e.g. A = 1), the localized states are 
apparent. However this is not true for smaller amount of disorder (e.g. A = .1). The localization is not so 
apparent for A = .1 for N = 50, but when the number of atoms is increased, that is, setting N = 100, the 
localized eigenvectors are present as in the case when A = 1; see Figures [4] (part b) and [5] 

In the contour plots of Figures [6j [7] and [HJ the eigenvectors in two and three dimensions of the Anderson 
model are more peaked than those of the nonrandomized Schrodinger for large disorder A > 1. As in the case 
for one dimension, localization is not apparent for small disorder (A = .1) as seen in Figure [6] Moreover, 
as N increases, for small disorder the eigenstates of both discrete Schrodinger and Anderson models seems 
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(a) N=29 




(b) N=48 



Figure 6: Two-dimensional Eigenvectors of the Discrete Schrodinger Operator (left column) and the An- 
derson Model (right column) for varying disorder (A = 10 (top), A = 1 (middle) and A = .1 (bottom)). 
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Figure 7: Factors of the Multilinear SVD Decomposition 14J of the Two-dimensional Discrete Schrodinger 
Operator (right column) and the Anderson Model (left column) for varying disorder (A = 10 (top), A = 1 
(middle) and A = .1 (bottom)). 



to coincide. This does not necessarily mean that the localization is absent for this regime, but rather the 
localized states are harder to find for small amount of disorder and a larger amount of atoms have to be 
considered. In Figure [7J localization is not clearly visible for even A = 1 in the factors calculated via the 
multilinear SVD decomposition [T3] while localization is detected in the plots in Figure [6] when A = 1. The 
plots in Figure [7] are generated by applying the HOOI algorithm [15] to the Hamiltonian tensors ( 5.2|5.3 ). 

Our numerical results provide some validation that these localizations exist for large disorder for dimen- 
sion d > 1 for sufficient amount of atoms. 



6 Multilinear Least Squares 

Under the Einstein product rule, odd-order and nonhyper-rectangular tensors do not have inverses. In this 
section, we extend the concepts of pseudo-inversion for odd-order tensors and and nonhyper-rectangular 
tensors. 



6.1 Least-Squares 

The linear least-squares (LLS) method is a well-known method for data analysis. Often the number of 
observations b exceed the number of unknown parameters x in LLS, forming an overdetermined system, 
e.g. 

Ax = b (6.1) 
where A £ R mx ™, x € W 1 , and b G K m with m > n. Through minimization of the residual, r = b Ax, 



the overdetermined system (6.1) can be solved. If the objective function being minimized over x € K™ is 
0(x) = ||r||^ 2 , then this is the least-squares method. Thus, the solution obtained through LLS is the vector 



x* G R n minimizing </>(x); the vector x* is called the least-squares solution of the linear system (6.1 1 
Here are examples of overdetermined multilinear systems. 

(i) A » 3 x = B where A G R IxJxK , x G R K and B € R IxJ 



(ii) A * X = B where A G WL IxJxRxS , X G i BxSxKxL and B G wixJxKxL 
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(a) A = 10 



(b) A = 1 



(c) A = 0.1 






(d) A = 10 



(c) A = 1 



(f) A = 0.1 



Figure 8: Two views (front (first row) and top (second row)) of the Three-dimensional Eigenvectors of the 
Anderson Model (left) and the Discrete Schrodinger Operator (right) for varying disorder. 



For both cases, higher-order tensor inverses of A do not exist. The formulations, 

min \\A »3 x — Bllf and min \\A* X — B\\p i 

x X 



(6.2) 



are considered to find multilinear least-squares solutions of systems. Note that the Frobenius norm, |] • \\p, 
is defined as = 



„_„..,„ K l2 ..., JJV | 2 for A** 1 **" 1 * 



6.2 Normal equations 

Definition 6.1 (Critical Point) Let 

point of cf> is a point x G K™ such that 



Consider the multilinear system, 



— ► K be a continuously differentiable function. A critical 
V</>(x) = 0. 



A » 3 x = B 



where A G R IxJxK , x € R K , and B G R lxj and define 

0i(x) = \\A» 3 x-B|||. 
Lemma 6.2 ^4n?/ minimizer x G o/(/>i satisfies the following system 

A T * 2 A» 3 x = _4 T * 2 B. 

Proof. We expand the objective function, 

<£i(x) = (i. 3 x-B,i. 3 x-B) = (i. 3 xM. 3 x)-2(i» 3 x,B) + (B,B) 

= (^.3x) T (^. 3 x)-2B T (^.3x) + B T B. 



(6.3) 
(6.4) 
(6.5) 
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Then, 



V0i(x) 



9x 



[(^. 3 X) T (^. 3 X) 



d_ 

d_ 
dx" 



»j V kl I 



_9_ 

dx" 



A'/ 



kl*l 



L kl 



= 2(A T * 2 A)x 



= 2A T * 2 A » 3 x 



(6.6) 



and 



<9x 



2 ■;- [(„4. 3 x) T B] 



Ox 
<9x 



2.4 T * 2 B 



(6.7) 



where A T , a permutation of A where (A T )kij — {A)ijk- Thus from ( 6.6|6.7 ), 



a 



- (x) = 2^ T * 2 Ai 3 x- 2.4 T * 2 B. 



Clearly, the minimizer x of <f>\ satisfies 

A T * 2 A » 3 x = „4 T * 2 B. 

Furthermore, the critical point is x = (_A T *2 -4) -1 *2 *2 B. 



□ 



For the problem 

where Ae R IxJxRxS .X e 



A* 2 X = B 

^xSxKxL and B £ R /xJxJfxi and the ob j ective f unc tion, 



02 (X) 



\\A *2 X 



II, 



we have the following lemma. 
Lemma 6.3 Any minimizer X G 



ifixSxKxi 



o/ 02 satisfies the following system 



A T * 2 A* 2 X = A T * 2 B 



(6.8) 



(6.9) 



where A T £ 



pB,xSxIxJ 



denotes the transpose of A £ 



pIxJxRxS 



Moreover, the critical point of cj>2 is 



X = (A T * 2 A)- 1 * 2 A T * 2 B. 



Remark 6.4 We omit the proof for Lemma \ 6.3\ since it is similar to that of Lemma \ 6.2\ Both cr itica l 
points, x = (A T *2 A)^ 1 *2 A T * 2 B and X — (A T *2 A) _1 *2 A T *2 B are unique minimizers for (6.4) 



and \6.8\ ), respectively, since <f>\ and $2 are quadratic functions. Equations (6.5) and \6.yy are called the 
high- order normal equations. 



6.3 Transposes and permutations 



From Definition 3.8 the transpose of A € 



alxJxRxS 



in (6.9 1 is easily obtained. Since the definition only 



holds fo r ev en-order tensors, we extend the notion of transposition to third (odd) order tensors. Recall in 
Lemma 6.2 we have denoted a permutation of A 6 l fx xK as A T € M. KxIxJ . The transpose of a third 
order tensor A G ^ Ix J xK } s a permutation since third order tensors can be viewed as fourth order tensors 
with one mode in one-dimension. For example, if B is a permutation of A € ^ IxJxKxL with L = 1 , then 
bkUj — o-ijki which is bijki = a>kiij = a>kij- Thus we denote B = A T where B £ ~R KxIxJ . 

Unlike in the matrix case where (A T ) T = A, for third order tensors we have the following property. 
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Lemma 6.5 (Property of third order tensor transpose) Let A € R IxJxK and p be a permutation 
on the index set {ijk}. Then ((A T ) T ) T = A. 

Proof. For the index set {ijk}, there are two cyclic permutations: pi(ijk) = jki, p2(jki) — kij, psikij) = 
ijk and pi(ikj) = kji, p~2{kji) = jik, p~z{jik) = ikj. It follows that {{{A T ) T ) T ) l]k = (f)ijfe = (2 ? ) P3 (feij) 
((A T ) T )kij = (p)uj = (V)p2(jki) => {A T )jki = {T>)ju = (^)pi(ijk) => (A) ijk = ip)i k- □ 

There are six permutations for a third order tensor, although there are two cyclic permutations. For an TVth 
order tensor, the number of tensor transposes is dependent on the number and length of cyclic permutations 
on the index set \i\ii ■ ■ ■ In}- Table 2 lists all the possible multilinear least squares problems for third order 
tensors and their corresponding tensor transposes. 



A 


x 


B 


A 1 


ml xJxK 


R K 


R lXj 


R K xlxj 


r jxk xi 




R JXK 


R l XJXK 


R K xlxj 


R J 


R KX1 


R JxKxl 


R i xk x J 


R J 


R Kxl 


R Jxl xK 


R K XjXl 


R 1 


R KXJ 


R l xKxj 


R J x 1 x K 


R K 


R Jxl 


R K xjxl 



Table 2: Dimensions for Higher-Order Normal Equations for Third Order Tensors 
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Appendix: Proof of Theorem 3.5 



Proof. Here we prove the main theorem by checking each axioms (Al — A3) hold in Definition 3.3. 

(Al) Show that the binary operation * 2 is associative. 

Since we know that / is a bijective map with the property that f(A*2 B) — f(A) ■ f(B). We will show 
/- 1 (A-B) = /- 1 (A)* 2 /- 1 (B),for A.BeM. 

Let A,B,C e T and A, B, C e M where /(A) = A, /(B) =B and /(C) = C. Then, 

(A* 2 B)* 2 C = r 1 (A)* 2 /- 1 (B)* 2 r 1 (C) = r 1 (A.B.C) = r 1 (A.(B-C)) 
= f~\A) * 2 f-\B -C) = A* 2 (/ _1 (B) * 2 r\C)) = A* 2 (B * 2 C) 

Therefore, (A * 2 B)* 2 C = A * 2 (B * 2 C). 

(A2) Show that there is an identity element for * 2 on T. 

since € M is the identity element in the group. Note that we will suppress the superscript 

of I in the calculation below. Then we claim that / _1 (I) is the identity element for * 2 on T. 

For every element A € T, there exists a matrix /leMso that /~ X (A) = A. So, we get 
A *2 rHl) = / _1 (A) *2 /^(I) = r\A ■ I) = f-\A) = A 

Similarly, 

r\i) *2A = r x (i) * 2 r\A) = r\i ■ a) = /- \a) = a 

Therefore, A * 2 = /"H 1 ) *2 A = A. 
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Define the tensor £ as follows 

(£)iii 2 Jlj2 = ^iljl ^iaja 

where 



5ik = 



1, / = k 
0, Z ^ k 



We claim that £ = / 1 (I). By direct calculations, we have 

(£ *2 ^titajija = ^ ] £jii2UV a uvjij2 = e 'ii'(2Ui2 a iii2jij2 = ^iiii ^12*2 a iii2ji 02 = a iiii]\ii = "^iiiajia 



and 

(./4 *2 £)iii 2 jij2 = ^ ] a i 1 i 2 uv e uvj 1 j 2 = a iii 2 jija e ii «2j'i J2 = a ii *2 ji ja ^ji Ji ^Jaja = a i\iihh = • / ^»i»2jij2 ■ 

Thus £ *2 -4 = -4 *2 £ = -4, for V.4 G T. Therefore £-i 1 i 2 j 1 j 2 — fii 1 j 1 di 2 j 2 is the identity element for * 2 
on T. 

Finally we know that /-l(I^^x/i/ 2 ) = g 
(A3) Show that for each A&T, there exists an inverse A such that A*2A = A*2A = E. 



We define A — f 1 {[f(A)] 1 } since f(A) € M and / 1 is a bijection map from Lemma (3.4). Then, 
f(A* 2 A) = f(A) ■ f(A) = {f (AT 1 ■ f(A) = 



From Lemma 3.4 and since /(£) = I /l/2X/l/2 j we obtain A *2 A = £. 
Similarly, we can get A *2 A = £. 

It follows that for each A ET, there exists an inverse A such that A *2 A = A *2 A = £. 



Therefore, the ordered pair (T, * 2 ) is a group where the operation * 2 is defined in (2.2). In addition, the 
transformation / : T — > M (3.1 ) is a bijective mapping between groups. Hence, / is an isomorphism. □ 
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